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Abstract 

This paper addresses the two problems of flow and density reconstruction in Road Transportation 
Networks with heterogeneous information sources and cost effective sensor placement. Following stan¬ 
dard macroscopic modeling approaches, the network is partitioned in cells, whose density of vehicles 
changes dynamically in time according to first order conservation laws. The first problem is to estimate 
the flow and the density of vehicles using two sources of information, namely standard fixed sensors, 
precise but expensive, and Floating Car Data, less precise due to low penetration rates, but already 
available on most of the main roads. A data fusion algorithm is proposed to merge the two sources of 
information for observing density and flow of vehicles. The second problem is to place the sensors in 
the network by trading off between cost and performance. A relaxation of the problem is proposed based 
on the concept of Virtual Variances. The efficiency of the proposed strategies is shown on a synthetic 
regular grid and in the real world scenario of Rocade Sud in Grenoble, France, a ring road 10.5 km 
long. 

Index Terms 

Road Transportation systems; Dynamical flow network; Density reconstruction; Floating Car Data; 
Optimal Sensor Placement. 

I. Introduction 

The last decades have witnessed a considerable increase of the number of vehicles especially 
as consequence of urbanisation in big metropolis, not matched by a comparable extension of 

The authors are with Univ. Grenoble Alpes, Gipsa-Lab, with CNRS, Gipsa-Lab, and with Inria Grenoble Rhone-Alpes. F-38000 
Grenoble, France enrico . lovisari, carlos . canudas-de-wit, ala in . kibangou@gipsa-lab . f r 


July 28, 2015 


DRAFT 



2 


road infrastructures. As a consequence, crucial freeways, highways and arterial roads have been 
steered to a state of near saturation, and experience on daily basis periods of congested traffic 
[1]. In turn, congestion causes increased travel times and stop-and-go phenomena, leading to 
decreased safety, economical losses, and environmental and psychological hazards in terms of 
pollution and road rage [2]. Increasing road capacity by extending road infrastructures, such as 
construction of new arterial roads, has been the standard way to cope with congestion problems, 
but that is often infeasible when existing roads lie on built-in areas. Intelligent Transportation 
Systems (ITSs), on the contrary, exploit recent technological and theoretical advancements in 
distributed computation and communication, and are expected to provide robust techniques for 
real-time monitoring, prediction and actuation of traffic networks, and to better integrate with 
road and rail public transportation. 

The first goal of the present paper is addressing the problem of estimating road usage in terms 
of density and flow of vehicles in a traffic network. The latter are commonly considered a good 
representation of the state of the system, providing more information than average speed alone. 
In particular, they are of crucial importance for 1) forecasting travel time and traffic evolution, 
along with historical data; 2) informing in real-time drivers about the state of the network through 
navigation systems; 3) providing public authorities with statistical data to monitor the state of 
the network and predict dangerous scenarios; 4) computing and actuating control actions through 
traffic lights, ramp metering and speed limits, or, in the future, lane change and semi-autonomous 
routing and navigation [3], [4], [5], [6]. 

Standard devices to obtain information on the state of the network are fixed sensors such as 
induction loops and magnetometers. Placed over a section of road, they provide rich information 
on the vehicles that cross such a section over a pre-fixed period of time: 1) their number, or 
flow, 2) their average speed, and 3) their average density, or more precisely their occupancy 
(see Section 0 Current technology allows for very precise measurements, with relative errors 
of measured quantities against ground truth often being below 1 ~ 2 %. However, deployment 
and maintenance of a sensing network requires considerable investments and human force, and 
consequently sensing networks are usually designed to be as sparse as possible. The second 
problem addressed in this paper is indeed the Optimal Sensor Placement problem, that is, 
positioning sensors on the cells of the network given partial information on the system and 
in such a way to trade off between performance and cost. 
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Recently, the spread of wireless devices allows sensing and communication capabilities un¬ 
foreseeable up to few years ago. Limiting the attention to traffic applications, vehicles equipped 
with positioning devices (such as GPS) and able to communicate with an ITS monitoring system 
can act as a probes in the traffic and provide Floating Car Data (FCD), namely, information on 
the vehicles’ positions and speeds. The collected data can be used to estimate the speed in the 
network, thus offering a second source of information. Due to privacy reasons, single vehicles 
traces are usually not directly used, but rather aggregated as average speed of vehicles in segments 
of road. Advanced methodologies ensure fine spatial partitions of the network, with segments as 
short as 250 meters [7]. Compared to fixed sensors, a service based on this technology can only 
make use of information coming from her customers, which are a fraction of the total vehicles 
on the road (the penetration rate of the system). This implies that speed measurements are 
less precise and flow measurement are unavailable. On the other side, since it exploits existing 
communication systems it is relatively inexpensive and, more important, already covers all major 
traffic networks. 

In the first part of this paper we propose an algorithm that aims to reconstruct the traffic 
density and flow by fusing fixed sensors measurements and Floating Car Data. We employ a 
macroscopic model, partitioning the network in cells and assigning to each cell a density of 
vehicles. The latter evolves dynamically according to a first order mass-conservation law. 

Traffic models date back to the first half of the XXth century. The most celebrated macroscopic 
model is the PDE based Lighthill-Whitham and Richards (LWR) model [8], which, based on fluid 
kinematics, is able to reproduce crucial phenomena such as traffic shock waves. Discretization of 
the LWR-PDE is not straightforward but stable numerical schemes have been proposed, the most 
well known being the Cell Transmission Model (CTM) [9], [10]. Huge efforts have been put in 
the last 15 years to calibrate the CTM [11] and to unveil its system-theoretical properties [12]. 
Fusion of flow, density and speed measurements has also been addressed, even though mostly 
considering single vehicles traces. Approaches range from signal processing techniques such as 
the generalized Treiber-Helbing filter [13], nonlinear versions of the Kalman filter in the context 
of Lagrangian sensing [14], and stochastic versions of the three-detector model [15]. Recent 
approaches do not rely on discretization of the LWR-PDE model and allow to cast problems of 
estimation and control as convex problems [16]. 

We inherit from the CTM the assumption that the inflow in a cell is a fixed linear combination 
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of the outflows of the preceding cells. Differently from CTM, however, inflows and outflows in 
all the cells are estimated on the basis of the available flow measurements only. In addition, using 
the concept of Fundamental Diagram and the speed measurements, we compute an instantaneous 
(namely, only based on the latest available measurements) pseudo-measure of the density. These 
quantities are then the inputs for the density observer. Finally, we propose a gradient descent 
method to calibrate the Fundamental Diagram. 

In the second part of the paper we address the problem of Optimal Sensor Placement, namely, 
the problem of finding the best location where to physically place sensors. This is based on 
trading off between two contrasting objectives: the first, to maximize the performance of state 
reconstruction; the second, to minimize the total economic cost of the network. 

The performance of the state reconstruction is usually related to the ability to properly estimate 
the density of vehicles in the roads. Unfortunately, nonlinearity and complexity of traffic systems 
make it hard to evaluate the performance of nonlinear observers. In order to simplify the setting, 
we consider the related problem of reconstruction of flows in a static setting. In particular, we 
consider as performance metric the error covariance of an estimator of the cumulative flows in 
the network over a long period of time. The Optimal Sensor Placement problem can be then 
seen as trading off between the performance of such a flow estimator, and a cost that depends on 
the dimension of the sensing network. Since this is a combinatorial problem, we relax it using a 
method that we call Virtual Variance algorithm, based on the idea to associate to each sensor a 
virtual variance which is large when the sensor is not relevant for good reconstruction of the flow 
vector. The only input that the algorithm needs is the matrix of splitting ratios, that prescribes 
how vehicles split at each junction, and the nominal variance of each sensor. Furthermore, we 
discuss in detail two extensions of the proposed algorithm dealing with important scenarios. In 
the first, Optimal Sensor Placement with geographical constraints, we address the scenario in 
which sensors cannot be placed in a subset of cells of the network. In the second, Optimal Sensor 
Placement with Number of Sensors constraints, we deal with the case in which the maximum 
number of sensors is pre-specified, for example due to budget limitations. 

Optimal Sensor Placement is an ubiquitous problem that has received a high degree of attention 
in several communities due to its importance for netwFork design. In Transportation Systems, it is 
of interest both in the dual-problem of best placement of hubs for cost-efficient transportation of 
goods and people [17] and Origin-Destination coverage [18], [19]. In these works, and differently 
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from the present paper, the problem is cast as a mixed integer problem which corresponds to 
determine the minimal set of locations from which the flows on the whole network can be 
determined, and sensor measurements are assumed to be perfect. 

To summarize, the contributions of this paper are the following: 1) we formalize the problem 
and we design an easily implementable approach to data fusion of fixed sensors measurements 
and Floating Car Data; 2) we propose a gradient descent calibration algorithm of the underlying 
macroscopic model; 3) we formalize the problem of Optimal Sensor Placement in terms of 
positions of sensors in a network when sensors are noisy and we provide an approximate solution 
using the concept of Virtual Variances; 4) we show the prowess of the devised Optimal Placement 
procedure on a regular grid, for which we offer a comparison between the solution found with 
our approach and the true optimal placement, found by exhaustive search; 5) we illustrate the 
performance of the Optimal Placement procedure and of the Reconstruction algorithm through 
extensive numerical experiments in the real-world scenario of Grenoble Traffic Lab (GTL) [20], 
a sensing network deployed along the freeway “Rocade Sud” in Grenoble, France, with FCD 
provided by INRIX, one of the most well known traffic solutions companies. 

The remainder of the paper is organized as follows: after setting up the notation, Section |TT] 


describes the model for a Road Transportation Network. Section III formulates the problem of 
flow and density reconstruction and describes the proposed nonlinear observer, while the problem 
of optimal sensor placement is formulated and a solution based on the heuristic Virtual Variance 
algorithm is presented in Section IV Finally, Section [V] illustrates the solutions on a regular grid 
and on the real world scenario of the freeway Rocade Sud in Grenoble, and Section [VI] draws 
the conclusions and provides several future research directions. 


A. Notation 

The symbols W 1 , M" and M nxm denote the sets of real valued vectors of dimension n, of 
positive real valued vectors of dimension n, and of real valued matrices of dimension n x m, 
respectively. The symbol M' 4 (M+) with A a finite set is to be interpreted as the set of real 
vectors (real positive vectors) indexed by elements of A. The transpose of /l e M nxm is denoted 
A T . For a vector x 6 M", x > 0 is meant component-wise. The symbol / denotes the identity 
matrix of suitable dimensions. |^4| denotes the cardinality of the set A. 

A graph Q is a pair (V. 8) where V is called the set of nodes and £ the set of edges. The 
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functions t : £ —* V and h : £ —> V tell, for each edge e, which node is its tail and which node 
is its head, namely e = (h(e),t(e)). For e G £, denote by £f := {j G £ : h(e) = t(j)} and 
£~ := {) G £ : h(j) — t(e)} the set of edges that follow or precede e, respectively. A path of 
length n > 2 is a sequence of edges e 1} ... ,e n such that e i+1 G £+ for all z = 1,..., n — 1. A 
path of length 1 is a path made of a single link. The matrix L G M VxV is a sublaplacian of Q 
if L e j > 0 only if (e, j) G £, e j , and JA L ej < 0. 

II. Road Transportation System Model 

We adopt a macroscopic approach by partitioning the lanes of the roads in a traffic network 
in cells. Cells that lie on the same section of a road and on different lanes are said to be parallel 
one each other. We interpret each cell as an edge e G £ in a graph Q = (V. £), which models 
the whole network. Here, nodes v £ V represent junctions, at which many roads intersect, or 
sections of a road. Among the set of cells £, we denote by TV and 1Z° the set of onramps and 
offramps, respectively. In this paper, we shall call onramp (offramp) any gate, be it a real ramp, 
a connector, a secondary road, etc, which lets vehicles enter into (exit from) the network. 

We make the following connectivity assumption, which formalizes the mild requirement that 
any cell can be reached from an onramp and that vehicles from any cell can exit from the 
network. 

Assumption 1. For any cell e in £, there is at least one onramp j G TV and one offramp k G 1Z° 
such that e is an edge of the path from j to k. 

Time is discrete and slotted in intervals of duration T > 0. On each cell e . G £, denote by 
p e (t) the density of vehicles, in number of vehicles per kmQ at time t, and let p(t) = [p e (t)] e£ c, 
which we call the state of the network. The density of vehicles in a cell changes dynamically 
in time according to the following mass-conservation first-order model 

Pe(t+l)=ftW + i(/i"(f)-/,”*(()), Vee£ (1) 

where i e is the length of cell e, and ffU) and /° ut (t) are the inflow and the outflow at cell e 
during the t-th time slot. 

'While we employ SI units for simplicity and for coherence with the data in the Grenoble Traffic Lab, the presentation of 
our results would obviously be unchanged if other systems of measurements, such as the imperial system, were used. 
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To relate inflows and outflows we resort to the standard concept of splitting ratios. Indeed, 
denote by R ek > 0 the fraction of vehicles that turn into cell k when they exit from cell e, 
which is the splitting ratio of e towards k. Clearly, R ek = 0 if e and k are not consecutive, and 
R e k = 1 , if e ^ TZ°. From this moment on, we make the following assumption. 


Assumption 2. The set of splitting ratios {R e j} (e,j)&SxS is known. 


Splitting ratios establish the relation f™(t ) = Yhj^s Rjef° ut (t), for all t > Q and for any cell 
e TV, while ffit) = A e (t) for e G TV, where \, {t) is an exogenous external inflow. We set 
for sake of convenience A e (t) = 0 for e f TV. By stacking inflows and outflows into vectors 
f m (t) and / out (f), respectively, we can rewrite the previous relation in matrix form as 

/"*(*) = R T r‘(t) < 2 ) 


where the matrix R = [R-e]] e e£,]e£\n i is the matrix of splitting ratios. In the present paper it is 
assumed that the matrix of splitting ratios is fixed, predetermined, and known. Its calibration, 
closely related to the estimation of Origin-Destination pairs, can be performed on single-lane 
freeways with onramps and offramps by taking ratios of flows on main line and ramps [11]. We 
plan to extend this setting to networks by casting the problem as an optimization problem in 
future research. 


A. On modelling of cell flows 

Macroscopic models such as the CTM postulate that the flow /° ut (t) that exists from cell e 
at time t is a deterministic function of the densities in the cells around e. In the simplest case, 
where e and j are two consecutive cells and Ef = {j} and Ef = {e}, then in the CTM 

/ e ° Ut (f) = min {d e (p e (t)),Sj(pj(t))} 

where d e (p e (t )) and Sj(pj(t )) are the demand of cell e and the supply of cell j, and represent 
the maximum outflow from e and the maximum inflow into cell j, respectively. The resulting 
system is a Godunov scheme for discretization of the LWR-PDE model, and can be extended 
to the network case in various ways [10], [21], [22]. While these models reproduce important 
phenomena that must be taken into account when modelling and controlling traffic networks, 
such as the movement of shockwaves, each of them can only partially represent traffic dynamics 
in networks. 
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For this reason, we will avoid to explicitly model the relation between flows and density, and 
we will limit to the standard 

fe^ = PeVe, Ve G S (3) 

namely that the volume that exists from a cell in a period is proportional to the density of 
the vehicles in the cell, and to their speed. Again, we leave unmodelled the relation between 
these two quantities, because, as we will make clear in the following, we assume to have a direct 
measurement of the average speed in each cell (in the form of Floating Car Data). In conclusion, 
we consider from now on the dynamics of the real system to be dictated by Eqs. 0-0. where 
v e , e E S, is an unmodelled quantity which depends on the local state of the network. 

While in the model of our network we do not use an explicit relation between flows and 
densities, we shall use it for data fusion and estimation purposes. In particular, we write 

<Pe = </?e(p e ),Ve e £ (4) 

where tp e is the flow of vehicles at the sensor locations, which we shall always assume to the 
at the end of the cell. The graph of the function <^ e (-), which is the Fundamental Diagram on 
cell e, is a concave function with 99 ( 0 ) = = 0, where p jam is the jam density, at which 

vehicles are too close one each other to move (values for pj am vary from 150 to 300 vehicles 
per km). 

B. Available measurements 

In this paper we consider two heterogeneous sources of information: flow and density mea¬ 
surements from sensors and Floating Car Data. 

1) Flow and density measurements: Standard measurement devices for traffic are loop detec¬ 
tors or magnetometers, radar traffic detectors, or video detection systems. They are positioned at 
fixed and predefined positions in the network, monitor a section of road, and are able to detect 
and assign a timestamp to the event “a vehicle crossed the section”, and information is then 
aggregated in time slots. For sake of simplicity and without loss of generality, we assume that 
such time slots correspond to the time discretization of the system <[Tj). As such, measurements 
of the following two quantities are then available at every time slot of duration T : 

. Flow of vehicles (p e (t), by counting the number of vehicles crossing the section during the 
t-th time slot; 
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. Density of vehicles p e (t) over the section. The quantity that is actually measured by the 
aforementioned devices is the occupancy of vehicles o e (t), defined as the percentage of 
time any vehicle was standing over the section in the t-th time slot. However, occupancy is 
approximatively related to the density of vehicles by the relation p e ~ —, where f ave is 

the average length of a vehicle in km. For this reason, we shall assume from now on that 
devices can measure density directly. 

Measurements hardly correspond to the real value of the quantities that they represent, with 
sources of noise ranging from temporary inability to detect changes of the magnetic field, too 
fast or too slow vehicles, blurred videos, etc. We adopt a simple additive noise model 

<p?(t) = <p e {t)+u>t{t), ee£ m 
p?{t) = p e {t)+u>>{t), ee£ m 

where <p™(t) and p™(t) are flow and density estimates at time t, and u£(t) and cu£(t) are 
measurement errors whose stochastic properties depend on the performance of the sensor as 
well as on road and weather conditions, and £ m C £ is the set of cells equipped with sensors. 
Due to installation and maintenance costs, usually \£ m I « |£|- 

Remark 1 . It should be mentioned that measurement systems based on magnetometers can be 
used to measure the average speed of vehicles crossing the section they monitor, in addition to 
flow and density, simply by deploying them in pairs monitoring sections that are at a fixed and 
known distance. The two consecutive instants at which the same vehicle crosses the two sections 
provides then a measurement of its speed. This type of installation is however not standard and 
more expensive due to additional hardware and software. For this reason, and to show that our 
approach does not need this additional information, we shall assume that no speed measurement 
from static sensors is available. 

2) Speed measurements: As already mentioned, recent technological advancements provide 
ITSs and companies selling traffic solutions with speed measurements in the form of Floating 
Car Data (FCD), averaged over predefined sections of roads for privacy reasons. While some 
classes of vehicles, such as taxi and buses, can indeed be traced, we decided not to use such an 
additional information in this paper, and leave the possibility to use it as future research. 

Floating Car Data are less expensive and require less maintenance effort with respect to fixed 
sensors since they exploit existing communication architectures. For the same reason, they are 
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FCD segment s(e) = s(j) = s(k) 


Fig. 1. A stretch of road partitioned in cells and FCD segments. Splitting ratios are shown from a cell e to following cells, j 
and k. A FCD segment including, among others, cells e, j and k, is also shown. 


already accessible almost everywhere once a data collecting mechanism is deployed. We partition 
thus the network in segments, let S be the set of all segments, and we assume that a measurement 
of average speed is available for every segment s 6 S. 

Despite such advantages, FCD also have drawbacks. Aside from the already mentioned pos¬ 
sibly low penetration rate, information provided via FCD is usually averaged over a relatively 
long period of time. As an example, within the Grenoble Traffic Lab fixed sensors yield flow 
and density (and speed) measurements every T = 15 seconds. The FCD provided by INRIX 
are instead aggregated per minute, with standard practice ranging between 5 and 10 minutes. A 
comparison between sensors speed measurements from the GTL and FCD is provided in Figure |2j 
On the left panel, a comparison of FCD measurement and average of GTL for slow and fast 
lane at location Taillat on the Rocade Sud in Grenoble from 06:00:00 to 12:00:00 on April 24th, 


2014 (see Paragraph V-A.2 for details), which clearly illustrates the high measurement rate of 
sensors and the averaging effect of FCD, resulting in a smoother signal. On the right panel, a 
comparison of the two for all the cells on the main line at 08:00:00, showing that on average 
the measurements are in agreement. 

To formalize the scenario, we consider new speed aggregate data to be available every N 
time instants, N = kT for some k > 1, namely, at times N, 2N, 3N, ..., corresponding 
to the average speed in the periods [0, AT — 1], [A r , 2A r — 1], etc., respectively. As such, speed 
measurements can be formally written as 


,FCD 


(*) = 


J e > 
,,FCD / 


( 6 ) 


f G [0, AT — 1] 

_ S D m, t 6 [kN, (k + 1)N - 1] 

where 

• vg > 0 is the freeflow speed on cell e, namely, the speed of vehicles in low density regime; 
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Fig. 2. Comparison of speed measurements from sensors and Floating Car Data. 


FCD 

J s{e) 


( k ) is given by 


FCD 

V s(e) 


(k) 


1 

NKe)\ 


V i( T ) + W 5e) D ( fc ) 

j£s(e),T£l t 


where s(e) denotes the segment of which e is one of the cells (see Figure 0. 
a measurement error whose stochastic properties depend on the performance of the sensor 
as well as on road and weather conditions, and l t = (r : |_-^J — 1 < ^ < L^J}- 


III. Density Reconstruction 

The first problem we address in this paper is Density Reconstruction on the basis of hetero¬ 
geneous sources of information. In particular, our aim is to build an observer for the densities 
of vehicles in all cells of the network given static sensor measurements and Floating Car Data. 

We start by observing that Eqs. ([l)-([2]) cannot be directly used to observe the system except for 
the ideal scenario in which ideal measurements of the outflows f° ut (t), for all e £ £ and for all 
times t > 0, and of the initial conditions of the system, are available. Such a naive observer would 
however be very sensible to noise, as notice that while errors in the initial conditions correspond 
to offsets during the evolution of the system, noises in the flow measurements are integrated by 
the system’s dynamics, thus possibly producing unbounded and/or unrealistic results. Since real 
systems are never error free, Eqs. ([l])-© cannot be directly used to observe the system. 
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We solve this difficulty by considering the following standard Luenberger-like observer 


Pe(t + 1 ) = p e (t) + - f° Ut e (t)) + K(p e (t) ~ p e (t )) 




Ir\t) = 


Ve G £ 


Pe(t) = 


(7) 


where 

• p e (t) is the estimate of the density on cell e at time t; 

• fe 1 (7 )5 are estimates, based on the flow measurements, of inflow and outflow in cell 

e at time t; 

• p e (t) is an density pseudo-measure, based on flow and speed measurements, of the density 
on cell e at time t; 

• ft is a tunable gain trading off between flow and density pseudo-measure; 

. = [^(t)U £m and v FCD (t) = [v FCD (t)] ee s are the stacked versions of flow and speed 

measurements. 

We are interested in the following: 

Problem 1 (Flow and Density Reconstruction using Heterogeneous Sources): Design the maps 

r ={/fW : Rf r £ 

/ oul = {/““}=« : Rf -»■ r £ 

P — {Pe}e££ ■ R+ x R+ ~^ 


to minimize the absolute errors with respect to real flows and densities 


a p (t,e) = | p e (t) - p e (t) | 


a ¥ ’(f, e) 


- /rw 


( 8 ) 


A. A nonlinear observer for traffic networks 

In this subsection we describe the proposed solution to Problem [I] It consists in an offline 
calibration procedure and an online filtering step. 
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1) Offline calibration: This paragraph is devoted to providing a solution for calibrating the 
Fundamental Diagram. We employ a gradient descent strategy, similarly to [23], for which we 
do not require a CTM formulation as in [24]. 

Recall that the Fundamental Diagram is the graph of the function <p e (-) that is the flow of 
vehicles at the point where sensors are placed on cell e and whose argument is the density of 
vehicles on cell e. As such, it can be only estimated on cells e E S m , where measurements of 
flow and density are available. For a cell j E £\ £ m , we assume that the Fundamental Diagram 
can be estimated by extending by linear interpolation the parameters on the cells in £ m that are 
close to j, with coefficients that depend on the mutual distance between the cells and on the 
type of road j belongs to. 

The profile of Fundamental Diagram that we consider in the present paper is the following 

VeP, P< P c e 

a e p 2 + b e p + c e , p> p c e 

where 

• Pg is the critical density. It partitions the set of densities [0, p' ;im ] into the freeflow low- 
density region [0,Pg), in which the mutual influence of vehicles is small, from the high- 
density congested region (p c e , p|? m ], in which speed decreases with density due to interaction 
of close vehicles; 

• pi, am is the jam density, at which vehicles are so close one each other that they are unable 
to travel; 

• Vg > 0 is the freeflow speed on cell e; the value C e = v^p c e is the capacity of the section 
or road, namely, the maximum number of vehicles that can flow through it during a period 
T; 

• we assume that the Fundamental Diagram congested region is a convex quadratic function of 
the density. The following relations among the parameters a e , b e and c e hold for consistency 

«e (Pe) 2 + b e p c e + c e = v*(f e 
a e (pi am ) 2 + b ef )r + c e = o 

a e > 0 
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Remark 2. We defined <p e (') to be the flow of vehicles through a section during a sample time 
T. As such, the units of the speed vjf are km per T. 

Remark 3. A standard choice for the Fundamental Diagram is the a triangular Fundamental 
Diagram, for which in the congested region 

M P e ) = - Pe) 

where co e is the wave speed at section e. Clearly, our model recovers the latter with a e = 0, 
b e = —’jjp, and c e = u e p ] f m . The choice of a quadratic Fundamental Diagram has been driven 
by the empirical obseri’ation, based on data on our experimental setting, that the triangular 
diagram tends to overestimate the flow in congestion, as it will be shown in Section [T] 

An alternative appealing solution which fits our data is the inverted-\ fundamental diagram 
[25]. However, the number of parameters to be estimated is higher in the latter case, and the 
resulting model is more complex as it involves hysteresis. 

Motivated by the well known issue that deterministic Fundamental Diagrams are in any case 
only a rough approximation of the relation between flow and density, we chose the quadratic 
profile because of it is simple to calibrate and to use. 

We describe now the procedure for the calibration of the Fundamental Diagram. Let e 6 8 rn 
and let <p™ k )}k£ic, /C = {1,... ,K}, the set of K density and flow measurements used 

as learning set and obtained via the fixed sensor on cell e. For sake of notation, and since all 
variables refer to cell e only, let us write from now on pf and pf instead of pf k and p'f k , and 
the same for the parameters of the Fundamental Diagram. 

The proposed calibration procedure requires two steps 

. Estimation of p c and C = v s p c : the first step consists in estimating the critical density 
and the capacity C of the cell. We consider the standard least square estimation, which 
results into the following non-linear and non-convex minimization problem: given the set 


July 28, 2015 


DRAFT 


15 


of measures {(p%, cp^)} keK , K, = {1,..., K }, solve 


min( p c )C ) V,c) - | Ef=i(Xfc “ X( P c ,C){Pk )) 2 
s.t. 0 < p c < />i am 


C > 0 

X(p c ,C)(^) 



CV am —z) 

pjam — pC ? 


x < p c 

X > p c 


(9) 


We aim to solve ([9]) by the following gradient descent with diminishing stepsize algorithm 

- Basic step: initialize Pq-,Cq. A reasonable choice is p f {] = 20, which corresponds to the 
vehicles influencing one each other when the average distance among them is less than 
50 meters, and Co = 'i'J unit po, where r-'' 11111 is the speed limit on cell e normalized by 
the sampling time T; 

- n-th step: let (jf n , C n ) descend along the gradient of the cost, namely 


Pn+l ~ Pn n V p AV )C ) 
C n+} C n ^c^(p c ,c) 


with 


V p cV(pC iC ) 


VcV(pc )C ) 


X FF (p c ) 


X c (p c ) 


XI (^ fc _ Vip^-vCn-MPk)) 


fc6Z FF (p a ) 


:Pfe 


pi am - p k 


X ^(p a _i,C n -l)(Pfc)) Cn-1 ( ; am c ) 2 

feeze (pc) v 7 ^ Pn—l) 

X (^ fc _ ^(P^-iCn-O^fe)) 

TFF ,„„, v 7 A-i 

i)(Pfc) 


fcez FF ( P a ) 


- X [Vk ~ <P{fi_ x ,C, 

k€l c (p c ) 

{k G /C : 0 < < p c } 

{A: G /C : p k > p c } 


(j am - Pk 
C am - Pn—l 


where the gradients V p cV( p c,c) and VcV( P c ,c) arc computed at (jf , C) = (p^_ 1; C n _i), 
and 5 > 0 is a fixed initial step size. Notice that if p c n = 0 then X 1 1 (p c ) = 0, and 
conversely when p c n = (> y '" n then X c {p c ) = 0, and thus the previous summations are 
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always well defined. Nonetheless, for numerical reasons, additional care should be taken 
in order to avoid p c < 0 or p c > pj am , for example projecting at each step p c n+l into 
[0, p y, ' m ] after the gradient update. 


- Stopping criterion: stop if 


Pn 

c n 


Pn -1 

C n -! 


< e for some small threshold e > 0. 


• Calibration of the quadratic function in the congested region: the problem of calibrating the 
quadratic function for the congested region is cast into the quadratic problem: given the set 
of measures {(p£\ ip™)}keic, /C = {1,..., K}, solve 

min (a,6,c) Efcexc (pC) - (apl + bp k + c )) 2 

s.t. a ( p c ) 2 + bp c + c e = C 

2 ( 10 ) 
a (pi am y + V am + c = 0 

a > 0 

The problem ( [T0| ) is computationally very simple and can be solved using off-the-shelf 
tools. 

Notice that as side products of the previous procedure we can compute the freeflow speed 
as Vff = C/p c and, in case a bilinear Fundamental Diagram is also needed, the wave speed as 
bj = — C/ (pl am — p c ), for each cell. 

2) Online density reconstruction algorithm: We assume from now on that Fundamental Di¬ 
agrams have either been calibrated or extended on the whole network, and that the matrix of 
splitting ratios has been pre-specified or estimated on the basis of field surveys. 

We propose the following online algorithm for Density Reconstruction 

• at the beginning of the t-th time slot, a centralized computation unit 

- receives measurements {<^(t)} e e£ m ; 

- flow estimation: estimates the vector of outflows f oui (t) by solving the following 
minimization problem 


min /out ||(/-i?' r )/ out || 2 + 7 ^; ee£m (/ e out -^(f)) 2 
s.t. / out > 0 

Problem ( [TT| ) aims to a) match outflows and measurements where available, by penal¬ 
izing the squared difference between the two, and b) to balance outflows according to 
the splitting ratios. The latter term provides the estimate of flows on cells in which no 
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measurement is available, and is performed “as if” the network were at steady state, 
which is a simplifying assumption due to absence of a dynamical model for flows. 
The tunable parameter 7 selects whether more weight is given to matching estimated 
outflows and measurements (high 7 ), or to estimate the flows as if the network were at 
steady state (low 7 ). Once Problem ( [IT] ) is solved, the vector of estimate of the inflows 
is easily computed according to Eq. [ 2 } by setting f m (t ) = R 1 f out (t). 

- receives the measurements {v™(t)} ee £ when available, or holds the last speed mea¬ 
surements received; 

- For each cell e, computes the two possible densities pi (freeflow) and pi (congested) 
corresponding to flow f° ut (t ) assuming that the local flow <p e = ip e {p e ) is exactly 
determined by the Fundamental Diagram; 

rout ~ iout 

- For each cell e, computes the two velocities v e {p e ) = ljL T - (freeflow) and v e (p e ) = 

Pe Pe 

(congested); 

- Selects 

Pe(t) = arg min{|u e (pg) — v™(t)\} 

1 = 1,2 

as a rough estimate of the density. This estimate is only based on the actual measure¬ 
ments of flow and speed, and is generally very noisy, especially when the cell is in 
congestion. Therefore, the algorithm does not directly uses it; 

- density estimation : for each cell e G S, lets the density estimate evolve according to 
the observer equation ([7]). 

Notice that in the proposed solution we avoid using Kalman filter based strategies as, due to 
the high degree of nonlinearity and uncertainty of the system. Analysis and minimization of the 
error variance of the proposed solution is left for future research. 

IV. Optimal Sensor Placement 

The second problem we tackle in this paper is Optimal Sensor Placement, namely, the problem 
of deciding the position of sensors yielding a good trade off between performance and cost. 
Since assessing in a theoretical way the performance of algorithms for density reconstruction is 
difficult due to the nonlinear nature of the system, we simplify the setting and limit our attention 
to estimation of cumulative flows, namely, of the total outflows from the cells. The resulting, 
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static, problem is then considered as a proxy for the more complicated problem built on the 
dynamic density model. 

We start by deriving some properties of cumulative outflows, and we proceed describing a 
simple linear model for cumulative flows estimation. This will help us formalizing the Optimal 
Placement problem. 

A. Flow linear constraints 

Let f e \= Y?k=t /° ut (^) be the cumulative outflow from cell e, namely, the total flow through 

° r i t 

the cell over the period of time [f 0 ,fi], and let / = /, ... fl £ | e be the vector of 

cumulative outflows. By integrating the system’s dynamics we have 

4(Pe(tl) - Pe(t 0 )) = R i*h eE£ \ ^ 

j&£ 

Assume now that [f 0 ,fi] is a period of time whose duration is high enough, and that at both 
times to and t\ the number of vehicles in the network is low, for example, assume that £ 0 and 
t\ correspond to two consecutive midnights. Then the magnitude of the vector of differences of 
vehicles {£ e (Pe(ti) — pJjv)}e&£\n' is small if compared to the cumulative flows in the network, 
and the following relation holds approximately 

Lf ~ 0 , (12) 

where L e WS''' n ’' x£ , is the matrix obtained by removing from L = R T —I the rows corresponding 
to onramps. This imposes a linear constraint on the cumulative flows that we shall exploit in 
the next subsection. 

B. Linear measurement model and the Optimal Sensor Placement problem 

In this subsection we study the performance of a linear estimator of the cumulative outflows 
and we show how to formalize the problem of Optimal Sensor Placement. 

Let £ m C S be a set of cells in which sensors are placed. We assume the following simple 
linear measurement model 

y = H S mf + CO f (13) 

where 
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• y s is the measurement of the s-th sensor, namely f e + r/ s if the s-th sensor is located on 
cell e; 

• Hgm g {0,+l} pxn , p = \E m \, [Hgm\ se = 1 if the s-th sensor is located on cell e, and 
[H £ m\ se = 0 otherwise, so that Hgm 1 = 1 and l T Hg m l = p; 

• ojf is a random noise vector with zero mean and covariance matrix E nom , related to the 
measurement noise oj ip described in the previous sections. For sake of simplicity, we shall 
often assume that the components of the noise, one for each sensor, are independent with 
same variance al om , so that E nom = <rl om I. 

Let now n = \S\ and r = rank{L }, and consider a matrix V G M nxr whose columns are an 
orthonormal basis of the right kernel of L T , i.e., L T V = 0 and V T V = I. From ( [12] ) we get 
(approximatively) / = Vz for some z G M r , so that the measurement model can be rewritten as 

y = HgmVz + a/ . 


Given y, consider a linear estimator of z, z — K z y + q z , where K z G W xp and q z G M 
Best (minimum variance) Linear Unbiased Estimator of z corresponds to the solution to 

min K z , qz E[(z-i)(^-£) r ] 

s.t. E[z — z] = 0 

2 = K z y + q z 

The following Lemma formulate an equivalent problem. The proof is straightforward. 


(14) 

The 


(15) 


(16) 


Lemma 1. The solution ( K z ,q z ) to ( fl?] ) is given by q z = 0 and K z the solution of 

min^ z K Z T, 

nom K 

s.t. K z Hg m V — I 

which is 

K z = 

with error covariance 

E[(j -z)(z- z) T ] = (V T Hip^ m H em V)-' 

An immediate consequence of Lemma |T] is that the Best Linear Unbiased Estimator (BLUE) 
of / is 

/ = K f y = U(U T Fj ra E- 1 m ff fm U)- 1 U T fF£E- 1 m r / 
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and its error covariance is 

V r (£ m ) = E [(/ -/)(/- /) r ] 

= V(V T Hj^;J m H £ ,„V)-'V T . 

The quantity V p (£ m ) depends on a) the (right kernel of the) matrix of splitting ratios via the 
matrix V and the nominal variance of the noise ujf, two parameters that are assigned, and b) on 
the locations of the sensors, which is the set £ m , via the matrix H £ m. For this reason, we will 
take the magnitude of V p (£ m ), measured via its trace, as our metric to measure the performance 
of a sensor network placed on the cells £ m . 

Clearly, with no additional constraints the optimal placement is to equip every cell with 
sensors. This is straightforward as equipping all cells means setting H s = I, and from HjH £ = 
I > I lj m 11 £ m immediately descends V p (£) < V p [£ rn ), for any £ rn C £. 

Each device has however a non-negligible purchase and maintenance cost, which has to be 
considered when designing the sensor network. For sake of simplicity, in this paper we make the 
simplifying assumption that the cost of a network over its lifetime is proportional to its number 
of sensors via a coefficient c > 0, so that the cost of deploying sensors on £ m is c\£ m \. 

We thus consider the following 

Problem 2 (Optimal Sensor Placement): Let Q = (V, £ ') be a traffic network with splitting ratios 
R and cumulative flows noise variance cr^ om . Find £ rn which solves 

min £ m trace {V p (£ m )} + c\£ m \ (17) 

The optimal £ m , solution to ( [XT] ), trades off between the network performance, which is 
measured by the trace of the estimator error covariance, and the total cost of the network. 
Clearly, the two have a contrasting effect on the number of deployed sensors. It is however 
inherently combinatorial, the optimal position of the sensor being in general hard to find and 
requiring an exhaustive search among all the possibilities, which is intractable even for relatively 
small network dimensions. 

We approach the problem by proposing an heuristic that relaxes it into a convex problem. Such 
a strategy is described after the following brief discussion on the minimum required number of 
sensors. 
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C. Minimum number of sensors 

Before proposing our method for solving ( fj~7| ), we prove that there exists a lower bound on 
the number of sensors \E m \ in order the trace {V p (8 m )} to be finite. As it will be proven, 
below such number, which corresponds to the number of onramps of the system, the problem 
of reconstruction of flows admits infinite solutions. 

To this aim, relabel the cells in such a way that onramps are the first 1,..., \1Z 1 \ cells so that 
we can partition L as 



where L nn model the mutual influence of flows on non-onramp cells, and L on models the 
influence of onramps on non-onramp cells. 

Define now a dual graph Q d = (V 1 . 8 d ) in which the roles of cells and junctions are in a way 
reversed, and in particular in which V d = 8 \ 1Z L and (e, j) G 8 d if e f j and [L nn ] e j f 0. Then 
it is easy to see that Lf rn is a sublaplacian of Q d . The following result is adapted from [26]. 

Lemma 2. Let Q = (V,£) be a graph and J G Rl v l x l v l be a sublaplacian of Q. Then all the 
eigenvalues of J have negative real part except possibly eigenvalues in 0. Moreover, if S is the 
set of nodes v for which J vu < 0, then J is invertible if for every u there exists a directed 
path in Q from u to a node v G S. 

In the case under analysis, we take J = Lf, and S to be the set of cells directly following 
an onramp, namely S — {e G 8 : 3j G 1Z 1 : Rj e > 0}. Recall that by assumption for every cell 
e VJ there exists an origin j G TZ l and a path from j to e, so there must also be a k G S and 
a path from k to e (at most being k = e), so the assumptions of Lemma [2] are satisfied. This 
establishes that L nn is invertible, and therefore that L is a full row-rank matrix with rank \8\7Z l \. 
Since the number of columns of L is \8\, it follows that its kernel has dimension r = \TV\, and 
thus rankjl/} = \TV\. 

We offer two interpretations on this fact: 

• Ideal measurement scenario: assume p = \8 rn \ ideal measurements at sensors s(l), ..., s(p) 
are available, ypp = f s ^, i = 1,... ,p. Then a solution to the system of equations 

\lf = 0 

I fs(i) fs(i) i 1 1 • • • iP 
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is a candidate vector of cumulative flows. Then, if p > \7Z l \ the system has a unique solution 
which is the true vector of flows. Conversely, if p < 1Z' \. the system is undetermined; 

• Noisy measurement scenario: assume p = \£ m \ noisy measurements at sensors s(l), ..., s(p) 
are available and assume to adopt the Best Linear Unbiased Estimator presented above to 
estimate the flows. If p = £ rn < 7U|, then rank { ll[„, He"' } < \1Z l \ = rank { V}, which 
implies that rank [V T Hj m H £ mV] < \7V\. However, V T Hj m H £ mV E mI 7 ^I x I^ 1 I j so the 
matrix is singular, and therefore the trace of the error covariance is unbounded. Conversely, 
if p > 7T then rank \V r H £m H £ m V } = 7C| and the trace of the error covariance is 
bounded. 


D. Relaxation via Virtual Variances 

The solution that we propose is based on the observation that cells that are not endowed with 
sensors can be interpreted as cells in which sensors have infinite noise variance. It turns out that 
an equivalent formulation of ( [T7] ) is 

niin^m trace {V {y T Yy)~ l V T } + ca^ om l T T,~ x l 

Too, e££ m (18) 


s.t. 


EU = 


ry~ p CL p r 

u nom) c c o 


In fact 

• the term H £ m, which represents which cells are endowed with a sensor, is the identity 
matrix, namely, all cells are endowed with a sensor - except, some of them have infinite 
noise variance and thus provide no information; 

• the second term in the cost corresponds to c\£ m \ as 


cov 


1 t E" 1 1 = ca: 


2 

nom 


\ee£ m nom / 


In other terms, ( fl8[ ) corresponds to assigning a virtual variance E ee = a z e to each sensor, and 
decide for which it should be o\ = cr^ om , the cells in £"\ and for which it should be a 2 e = Too. 

We call E the (diagonal) matrix of virtual variances, and we let the corresponding trace of 
error covariance be denoted, with an abuse of notation, Vp(E) = trace {U(U T EU) -1 U T }. 

Our approach is then based on the intuitive idea that increasing the variance on the sensors 
that are not crucial for the solution of ([T7]) should not have a strong effect on the performance 
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term V P (E). More formally, we consider the following relaxed version of the previous problem 

min SeBn trace {V(y T Y,~ l V)~ 1 V T } + ca^ om l T E^ 1 l 
s.t. £ ee > crl om , Ve G E 


where O n is the set of diagonal matrices of dimension n. 

We now slightly rewrite the cost. First, by the well known property of trace trace {AB} = 
trace{I?A} and V T V = /, it follows trace {! f {V T H~ l V)~ 1 V T } = trace 
Second, we make the change of variables Q = E _1 . In this way, we obtain the following 
problem 


minn eDn trace {(V t VLV) 1 }+ 7 l T fil 
s.t. 0 < ft < S-' 


( 20 ) 


where 7 is a tunable parameter. The choice 7 = ca'l om is a natural one due to the previous 
discussion, but since 7 influences the relative weight of performance (penalized for low 7 ) and 
cost (penalized for high 7 ) we leave it as an additional degree of freedom. Notice, finally, that 
the term l T f)l corresponds to the t\ norm of the inverse of the variances, a term which is 
commonly used term to sparsify solutions of optimization problems. 

The Virtual Variance algorithm proceeds as follows: 


1 ) solve ( | 20 | ) and denote by Q its solution; 

2 ) compute £ _1 = Q 

3) discard all cells whose virtual variance is above a fixed discard threshold Td 

As explained above, if the found solution provides high virtual variances at locations where 
sensors are redundant then this is effectively a way to select the most important cells where to 
place sensors. 

However, this strategy does not, in general, provide good solutions to the problem. Indeed, 
numerical simulations have shown that in the considered scenario the solution of ( [ 20 ] ) can be 
often interpreted as endowing all cells of the network with sensors with average virtual variance, 
rather than keeping it low in some of them and high in others. 


In order to enhance diversity between sensors, we enrich the cost of ( f20| ) with a term that aims 
to penalize homogeneity. This is reminiscent of dissensus (as opposed to consensus) strategies 
in multi agent networks, in which each agent possesses a value and the goal of the network is 
to differentiate as much as possible such values. 


July 28, 2015 


DRAFT 




24 


In this paper, we make the following simple choice. Let W e M nxn ~ 1 be an orthonormal 
base of the subspace orthogonal to 1, that is, W*1 = 0 and W*W = I. We add to the cost in 
( |20| ) a term that is proportional to e l7 " * ni . Since the columns of W span the orthogonal to 
1, W*Vtl is high when the element on the diagonal of Q, which are gathered in the vector 01, 
are different one with respect to the other, and is low otherwise. 

We propose the following optimization problem 


minn e B n trace {(I/ T f2H) + 7 l T fil + ne xTw * ni 

S.t. 0 < Q < cr -2 / 

which, notice, is convex in the diagonal entries of fi. Here 7 , the total variance weight, and k, the 
discrepancy weight, are tunable parameters. Notice that high 7 penalizes the number of sensors, 
thus yielding to solutions with higher virtual variances at the expense of poor performance. 

We shall provide in the following paragraphs examples of application of the previous optimiza¬ 
tion problem for different sensor placement scenarios. We anticipate here that one can observe, 
as required, that the virtual variances solution of ( |2Tj ) are distributed in a strongly bimodal way, 
with low and high values being different by several orders of magnitude. As a consequence, it 
is usually easy to distinguish among the two groups and discard cells whose contribution to the 
performance metric would be negligible. 

I) Optimal Sensor Placement with geographical or budget constraints: In this paragraph 
we discuss two variations of the previous procedure, which address the additional problems of 
geographical constraints and of strict budget limitation. 


Optimal Sensor Placement with geographical constraints 


The first scenario we address is concerned with the scenario in which in which a) some cells 
cannot be endowed with sensors, for example for physical reasons, and/or b) subsets of cells for 
which either all cells are endowed with sensors, or none are. An example of the latter constraints 
is a multi-lane road which is modelled using parallel cells (on different lanes) and on which the 
traffic manager can deploy induction loop. The former are buried underground are are usually 
required to cover the whole carriageway. As such, if i and j are, for example, two parallel cells 
on the two lanes on a certain section or road, then either i and j are both equipped with a sensor 
(i.e., the induction loop), or not. 
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To encompass this type of constraints in the proposed procedure, let £ am C £, £ am = k, be 


the subset of available cells, and let Hg am € {0, +l} KXn be built as in Subsection IV-B Further, 
let S C be the set of subsets of £ which must be simultaneously equipped, or not, with 
sensors, where V s is the powerset, or set of subsets, of £. 

Then the following problem 

minneD fc trace {{V T Hj am ClH £am V)~ 1 } + yl T £ll + Ke- l ' w ' {11 
S.t. 0 < n < cr“ 0 2 m J 

flii = € cr, Vcr £ S 


( 22 ) 


is (j2Tj) once we constrain sensors to be place on cells in £ am only and simultaneous sensor 
placement to happen according to the constraints specified by the set S. Notice that the diagonal 
entries of the solution Q are the inverse of the virtual variances on the cells in £ am only. The 
matrix W is defined as previously, but with suitable dimension (k x k — 1 ). As in the general 
problem, cells are chosen only if the corresponding virtual variance is below a certain threshold, 
and clearly cells that are not in £ am cannot be chosen. Notice that the constraint Q,, = 1 1 :)3 in 
|), or the less requiring < e, for some tunable parameter e, are convex constraints, 


so ( 22 ) remains convex. 


Remark 4. By the discussion in Subsection IV-C the minimum number of sensors is r = 72 


As such, if \£ em \ < |72*| the problem ( |22| ) is not well posed and the solution will only have very 
high virtual variances. Clearly, such a solution is not acceptable and should be discarded. 


2) Optimal Sensor Placement with budget constraints: In this second scenario we discuss 
budget constraints in the form of constraints on the maximum number of chosen sensors, a very 
common requirement in real-case applications. 

We propose a solution based on the following iterative approach: 

• Initialization: set 7 ( 0 ) and n to some prespecified nonnegative values; t max to the maximum 
number of iterations; n max to the maximum number of sensors; 

• t-th step 

- The problem ( f2~i~[ ) is solved with 7 = 7 (7); 

- Let n(t) be the number of sensors in the solution provided by the Virtual Variance 
algorithm. Then 
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* If n(t) < n max , or if t > f max , the procedure stops; 

* Otherwise, set 7(£ + 1) = g(y(i)), where g is a monotonically increasing increasing 
function, and the procedure iterates. 

The rationale behind this procedure is that, as previously discussed, 7 penalizes a low total 
sum of the virtual variances. Therefore, by iteratively increasing 7 the solution to ( fTi~| ) will tend 
to exhibit more and more high virtual variances, thus reducing the number of sensors. 


Remark 5. Once again, and related to the discussion in Remark |4] the specified maximum 
number of sensors cannot be be less than r = R by the results presented in Section 
If this is not the case, numerical experiments show that the algorithm either simply iterates 
until the number of iterations reaches £ max , or the found solution exhibits an extremely high 
trace { (V T - due to the fact that the internal matrix is (numerically) almost 
singular. As in Remark [?] such a solution is not acceptable and should be discarded. 


IV-C 


V. Numerical experiments 
A. Numerical Experiments for the Optimal Sensor Placement 

In this subsection we present the results of two numerical experiments. In the first, we solve 
the problem of Optimal Sensor Placement in a small (but not trivial) regular grid, in the second, 
we apply the procedure to the real-world case of the Rocade Sud. 

1) Regular grid: Consider the regular grid composed of 25 cells shown in Figure [3j We want 
to solve the problem ( fl7| ) by ( [20] ) with parameters (jf, m = 1, nominal sensor variance, and c = 1, 
cost of a single sensor. 

The network is small enough to run an exhaustive search to solve the problem ( [XT] ). In 
particular, for each h = 4, 5, 6,..., 21, we try all possible combinations of h = \E m \ sensors, 
thus finding the one that minimizes V p (£ m ). 

We also run our Virtual Variance algorithm with total variance weight 7 = 2 and discrepancy 
weight k = 20. Further, we set the threshold on the virtual variance for discarding a sensor to 

T d = 100. 

We illustrate the results in Figures [3] and |4j In Figures [3] the 8 cells chosen by the Virtual 
Variance algorithm are represented as a green dot marks, and the best possible placement with 
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E 6 7 


12 13 


19 20 21 

4 22 23 24 (> 25 


Fig. 3. The regular grid network used in the numerical experiment. The green dots correspond to the 8 cells selected by the 
Virtual Variance algorithm with 7 = 2 and k = 20. The red dots correspond to cells selected via exhaustive search when the 
number of possible sensors is 8. 


8 sensors (found by exhaustive search) as red dot marks. It can be noticed that both procedures 
place most of sensors at the boundaries of the network. 

Figure [4] shows instead the total cost of the best placement obtained through exhaustive search 
for h — 4,..., 21, and the total cost found by the Virtual Variance algorithm with 8 sensors. 
By total cost we mean the sum of estimation performance and network cost V p (£ m ) + c\£ m \, 
which, notice, is not the metric that is used in the Virtual Variance algorithm. Nonetheless, it is 
appreciable that the Virtual Variance algorithm not only provides a solution whose number of 
sensors is close to the global optimum (which is with 6 sensors), but also that, using 8 sensors, 
the Virtual Variance algorithm places them almost in the optimal way. 

2) Rocade Sud: Our second experimental setting is the Grenoble Traffic Lab (GTL), a network 
of sensors deployed for monitoring and research purposes along the Rocade Sud, a peri-urban 
10.5 km long freeway connecting the two highways A41 (north-west) to A480 (south) in the town 
of Grenoble in the south of France, see Figure [5} The network is composed of 135 magnetometers 
buried in the ground on both lanes along the main line every 250 meters (on average), on each 
onramp and offramp, and on three connectors from urban roads to three onramps, for a total 
of 68 sensing locations. For our purposes, each sensing location will correspond to one pair 
of sensors. For a detailed report on the GTL, we refer to [20]. Furthermore, and for sake of 
simplicity, while in the real network sensors are deployed in pairs, we shall assume from now 
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Fig. 4. Results of the exhaustive search and of the virtual variance algorithm. 


on that each sensing location has only one sensor (as we shall always discard sensor speed 
measurements). 

Figure [5] shows the position of each of the 22 sections of the main line in which there are 
sensing locations on both slow and fast lanes (and usually a ramp). In the same figure we also 
show a stylized representation of the freeway, including ramps and queues, the positions of the 
68 sensing locations, and the distance between consecutive measurement sections along the main 
line. 

We partition the Rocade in cells in such a way each cell includes one sensor, so that in 
Figure [5] each numbered circle also corresponds to one cell. 

Here, we do not consider onramps and offramp, limiting our attention to the main line 
of the Rocade Sud. The reason for this choice is that the Rocade has 10 onramps along its 
main line, which, summed to the two cells in the very first section of road, imply a minimum 


number of sensor of 12 by the discussion in Subsection IV-C While this number is not high 
on its own, numerical experiments not reported in this paper have shown that good estimation 
performance require a number of sensors which is too high in most realistic (i.e., non academical) 
implementations. 

The corresponding reduced graph representation, essentially made of several groups of parallel 
edges, consists of 46 cells. We provide a stylized version of it in Figure [6} 

The matrix of splitting ratios could be estimated from the data, but we make here a simpler 
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Fig. 5. The experimental setting: the town of Grenoble and the Rocade Sud and a stylized version of the freeway. The positions 
along the main line of the 22 sections of the main line in which sensors have been placed is shown as red pin. The positions 
of the 68 fixed sensing locations are shown in the stylized map. Each sensing location also corresponds to a cell. Light blue 
circles denote fixed sensors that are selected by the Virtual Variance algorithm and are used in the implementation of the Density 
Reconstruction algorithm. Each rectangle represents one FCD segment, often providing average speed measurements over more 
than one cell. 


assumption and assume that vehicles split according to the following rule: vehicles on slow 
lane cells remain on slow lane or turn into fast lane with a 70%-30% rule, and analogously for 
vehicles on the fast lane. If the next section has three parallel cells (cells 22-23-24 and cells 
66-67-68), vehicles spit uniformly in such three cells. 

We did not run an exhaustive search due to the relatively high dimension of the network and 
the consequent relevant computational load. Our comparison is instead with the locations of fixed 
loops installed by the Government Agency Centre national d’information routiere (CNIR) [27], 
which correspond to cells marked with a red dot in Figure [6j We shall show that the latter are 
positioned in a way that is in good accordance with the results of the Virtual Variance algorithm, 
even though our procedure allows for a slightly better design. 

We run the Virtual Variance algorithm in three scenarios: 1) unconstrained scenario with total 
variance weight 7 = 0.2 and discrepancy weight k = 20; 2) unconstrained scenario with 7 = 1 
and k = 20; 3) constrained scenario with number of sensors at most 10, initial 7 = 0.2, and 
k = 20. We assume that er„ om = 1 and that the cost per sensor is c = 1. In all cases, sensors are 
constrained either to be present in both lanes on a same section, or to be absent. 

The results are summarized in Figure [6] and Table |T| The tables shows the number of sensors in 
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Fig. 6. Stylized representation of the main line part of the Rocade Sud. White ovals represents junctions of the graph. The 
selected positions of 17 fixed sensors by CNIR are marked by red dots, those of the 16 sensors chosen by the Virtual Variance 
algorithm by green dots. 


Scenario 

7 

optimal 

# sensors 

V p (S m ) 

V(£ m ) 

Fix 


17 

3.8161 

20.8161 

Unconstrained 

0.2 

14 

3.6867 

17.6867 

Unconstrained 

1 

10 

5.1732 

15.1732 

Constrained, # < 12 

0.32 

12 

4.4972 

16.4972 


TABLE I 

Results of the four considered scenarios. 


the solution computed via the Virtual Variance algorithm, as well as the trace of the corresponding 
estimation error covariance V p (£ m ) and the total cost V(£ m ) = V p (£ m ) + c\£ m \. In Figure [6] 
cells found in the unconstrained scenario with 7 = 0.2 are marked with a green dot. As can be 
seen in Table [TJ our algorithm requires one less sensors than the network deployed by CNIR and 
in addition the trace of the error covariance is smaller. 

In the constrained scenario and in the unconstrained scenario with high 7 (which, as explained 
above, indirectly penalizes the number of sensors), the trace of the error covariance V p (£ m ) 
increases, as expected. Furthermore, the chosen cells in the latter two cases are subsets of the 
cells chosen in the unconstrained case: in particular, in the constrained scenario all cells are kept 
except 13, 14, 49 and 50, and in the unconstrained scenario with high 7 the algorithm further 
discards cells 64 and 65 
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B. Density reconstruction - experimental results 

We provide here numerical results of the implementation of the data fusion algorithm for 
density reconstruction on data from the Rocade Sud. 

On each sensing location and every T — 15 seconds, the system counts the number ip™ of 
vehicles that crossed the location, their average speed v™, and the average occupancy o'" of 
the location. Since the latter is approximatively proportional to the density of vehicles, so we 
shall assume that sensors can directly measure densities. In addition to fixed sensors, we use 
Floating Car Data provided by INRIX. Following INRIX schema, the Rocade has been further 
partitioned into FCD segments. One measurement of average speed is available on each FCD 
segment every 1 minutes. FCD segments partition the whole main line of the Rocade and include 
most onramps and offramps, but single lanes are not distinguished along the main line. FCD 
segments are represented in Figure [5] as rectangles encircling several sensing locations/cells. 

For our experiments, we employ the sensor configuration obtained in the previous section via 
the Virtual Variance algorithm with 7 = 0.2. In particular, and in order to prove that the method 
shows good performance even with sparse equipment, we only use the sensors on the main line, 
which are, for reference, shown in light blue in Figure [5} Further, we don’t use any information 
on flow or speed on the ramps. 

Calibration of the Fundamental Diagram was performed via the algorithm described in Para¬ 


graph III-A.l and using the data from the GTL sensor network from April 10th, 2014, a working 
day (a Thursday) exhibiting very standard traffic pattern: 

• very limited night time traffic; 

• a peak of congestion in the morning ( 8:00 - 10 : 00 ), triggered by vehicles exiting towards 
the city from the Rocade at the offramp of Eybens (cells 37/38) and spilling back until 
Meylan (cells 2/3), and a second, smaller peak of congestion triggered by vehicles entering 
in A480 at Rondeau (cells 66/67/68) but blocked by the high traffic on A480, and spilling 
back until around Liberation (cells 61/62); 

• a third, smaller, congestion triggered around Eybens around 14:00-15:00; 

• in general, medium/heavy but fluid traffic from 10:00 to 16:00 

. a second peak of congestion in the afternoon, again triggered by congestion at Rondeau 
at around 16:00, spilling back on the whole freeway in around 60 minutes, and lasting 
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Cell 37 - Fundamental Diagram 



Fig. 7. Calibration of the Fundamental Diagram on the cell Eybens exit - slow lane. The linear-convex Fundamental Diagram, 
calibrated using data from April 10th, 2014, is shown in thick line. The dashed thick line represents the corresponding linear 
Fundamental Diagram in congested regime. Each cross is a (flow, density) pair measured on April 24th, 2014, one for each time 
slot of T = 15 seconds during the whole day. Flow is the number of counted vehicles crossing the sensing location, density is 
the measured density of vehicles, during the time slot. 


approximatively two hours. 

As for the previous section, we consider the following simple rules to set the matrix of splitting 
ratios 

• let e be a fast lane cell. Then 70% of vehicles continue on the fast lane cell and 30% turn 
into the slow lane cell; if in the following section there are three parallel cells, vehicles 
split uniformly; 

• let e be a slow lane cell. If among the following cells there is not an offramp, then 70% 
of vehicles continue on the slow lane cell and 30% turn into the fast lane cell; if in the 
following section there are three parallel cells, vehicles split uniformly. Otherwise, 20% of 
the flow is directed towards the offramp, and the rest splits as previously specified; 

. if e is an onramp cell and j is the following slow ramp cell, then R, :) = 1. 

. if e is a queue cell and j is the following onramp cell, then R, :) = 1. 

In words, vehicles split according to a 70%-30% lane-change rule in the cells on the main line, 
and at each offramp approximatively 10% of vehicles exit from the freeway. 

1) Implementation: To assess our method, we considered the whole month of April 2014 
(except April 13th, a Sunday, for which FCD measurements were not provided). A typical result 
of calibration of the Fundamental Diagram is illustrated in Figure [7J which shows in thick black 
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Measured densities -2014-04-24 
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Fig. 8. Numerical results of density estimation on all cells: measured densities (left panel) and estimated densities right panel). 
Day: 24-04-2014 


the linear-convex Fundamental Diagram, in dashed thick black the corresponding standard linear 
Fundamental Digram in congestion regime, and as crosses the pairs (density, flow) measured on 
a day different from that used for calibration, in this case April 24th, 2014. As standard and 
well known, data in freeflow regime are in good accordance with the linear part, while data in 
congested regime are much more scattered and more difficult to fit. As it can be noticed, the 
standard bilinear Fundamental Diagram overestimates the flows in congested regime (the dashed 
think line is on average higher than the corresponding pairs (density, flow)), while the convex 
quadratic curve seems to better capture the average flow-density relation. Nonetheless, it is clear 
that the so found curve is only a very crude approximation of such a relation, which might be 
better captured using a stochastic description [28]. Investigation of the latter possibility will be 
the focus of future research. 

The proposed algorithm was implemented in Matlab on a non dedicated commercial laptop 
with 2.1 GHz i7-4600U CPU and 8 GB RAM. Optimization problems, required both in offline 
and online steps, were solved using standard Matlab functions as well as the modelling and 
optimization system CVX [29], [30]. The time required for calibration of the Fundamental 
Diagrams is between 30 and 40 seconds for each cell, while reconstruction of all the samples 
for a whole day requires less 10 minutes, averaging 100 ms per sample. 

Typical results are reported in Figures [8]{9| For validation purposes only, density and flow 
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Measured flows -2014-04-24 
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Fig. 9. Numerical results of flow estimation on all cells: measured densities (left panel) and estimated densities right panel). 
Day: 24-04-2014 


measurements from all GTL fixed sensors are considered ground truth. As such, the left panels 
show the evolution of the “true” measured density and flow in all the cells on the main line, 
over the whole day, for each day. On the x-axis, the 46 sensing locations along the main line 
(numbers correspond to the labels in Figure [5]), on the y-axis, the 5760 time slots over the 
whole validation day (one slot every T = 15 seconds). The chosen colors range from green (low 
density/flow) to yellow (medium/critical density/flow) to red (high density/flow). In the right 
panels, we show, using the same legend, the results of density and flow reconstruction. As it can 
be observed, the estimation algorithm is able to represent the four congestion events described 
in the previous section in a reasonably good way, given the limited amount of information 
employed; in particular, observe that the two small congestions at Rondeau during the morning 
and at Eybens during early afternoon, when present, are both detected. Notice that the resulting 
estimate remains good and reasonably close to the real profile, despite the absence of flow 
measurements on ramps, which are not negligible, especially in peak hours. 

The performance of the algorithm is quantitatively illustrated via the absolute error between 
measured and estimated flows and densities ([8]). 

The results are reported in Table [TTJ in which we report the maximum absolute error S on the 
75%, 90% and 95% of the pairs (cell, time), for all cells on the main line and all samples during 
a day, for all the considered days, and for both densities and flows. The table shows an average 


July 28, 2015 


DRAFT 















35 


Considered %(t,e ): 

5 : 

75% 

a p (t, e) 

90% 

< 5 

95% 

5 : 

75% 

a v (t, e) 

90% 

< <5 

95% 

2014-04-01 -Tuesday 

10.4317 

19.4079 

32.1121 

2.4901 

3.9824 

4.8838 

2014-04-02 -Wednesday 

10.058 

16.8625 

23.4218 

2.504 

4.0508 

4.9457 

2014-04-03 -Thursday 

14.214 

39.0677 

68.6383 

2.401 

3.8752 

4.765 

2014-04-04 -Friday 

13.3274 

35.6493 

67.2034 

2.5477 

4.0648 

4.9543 

2014-04-05 -Saturday 

7.6795 

11.6078 

14.9678 

2.5927 

4.138 

5.0691 

2014-04-06 -Sunday 

5.6087 

7.8254 

9.5749 

2.1949 

3.5373 

4.4432 

2014-04-07 -Monday 

8.8653 

14.9656 

21.5993 

2.2046 

3.6666 

4.6213 

2014-04-08 -Tuesday 

9.4236 

16.1212 

24.1236 

2.134 

3.5462 

4.4686 

2014-04-09 -Wednesday 

10.7057 

21.1416 

34.5579 

2.1496 

3.5922 

4.5308 

2014-04-10 -Thursday 

10.3578 

21.7147 

38.2241 

2.1529 

3.5922 

4.5125 

2014-04-11 -Friday 

10.1152 

19.0844 

32.2754 

1.8778 

3.2413 

4.1324 

2014-04-12 -Saturday 

5.7334 

8.4439 

11.1728 

2.1883 

3.5934 

4.5191 

2014-04-14 -Monday 

9.5789 

18.4657 

31.2646 

2.1349 

3.5506 

4.5182 

2014-04-15 -Tuesday 

11.2095 

18.9568 

32.7132 

2.0675 

3.6966 

4.6661 

2014-04-16 -Wednesday 

12.9045 

28.6587 

46.8904 

2.2881 

3.9458 

4.9609 

2014-04-17 -Thursday 

17.482 

43.2213 

62.2287 

2.0069 

3.5265 

4.4383 

2014-04-18 -Friday 

14.8889 

33.0979 

48.2054 

2.3525 

4.0964 

5.1424 

2014-04-19 -Saturday 

6.3368 

9.8998 

12.8105 

2.3955 

3.9664 

4.8847 

2014-04-20 -Sunday 

5.1875 

7.1362 

8.7217 

2.0231 

3.2701 

4.0776 

2014-04-21 -Monday 

5.4528 

7.433 

8.9298 

2.1513 

3.4238 

4.2284 

2014-04-22 -Tuesday 

12.0275 

26.3377 

43.2225 

2.2019 

3.7727 

4.7825 

2014-04-23 -Wednesday 

12.9473 

32.5637 

49.0958 

2.0893 

3.4793 

4.4004 

2014-04-24 -Thursday 

18.222 

42.5245 

59.2349 

2.0391 

3.3786 

4.282 

2014-04-25 -Friday 

24.4905 

49.1946 

65.6698 

2.0058 

3.3379 

4.244 

2014-04-26 -Saturday 

6.0991 

8.7496 

10.8339 

1.922 

3.3109 

4.1612 

2014-04-27 -Sunday 

5.4655 

7.4201 

8.93 

2.1341 

3.3638 

4.1301 

2014-04-28 -Monday 

13.8312 

35.9539 

57.5468 

1.9 

3.3551 

4.2986 

2014-04-29 -Tuesday 

9.7221 

17.4268 

25.9634 

2.2904 

3.7966 

4.714 

2014-04-30 -Wednesday 

12.736 

29.0402 

47.1253 

2.2473 

3.7883 

4.7363 

Average 

10.8656 

22.3439 

34.3882 

2.1961 

3.6531 

4.5694 


TABLE II 

Quantitative measurement of the performance of the proposed algorithm. Maximum magnitude of the 
ABSOLUTE ERROR ON DENSITIES (LEFT) AND FLOWS (RIGHT) ON 75%, 90% AND 95% OF THE PAIRS (CELL, TIME) 

BETWEEN 07:00 AND 19:00. 
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error of less than 11 veh/km for the 75% of pairs and less than 23 veh/jm for the 90% of the 
pairs. In 5% of the (cell, time) pairs the error is higher but still less than around 35 veh/km. As 
a comparison, we considered an oracle that knows 

• the exact outflow f° ut (t ) for every cell e and every sample time t (namely, for each 15 
seconds time slot); 

. whether the cell is in freeflow or in congestion, for every cell e and every sample time t. 
The average error obtained by the oracle is 5.8 for the 75% of the pairs (cell, time) and 12.5 
for the 90% of the pairs, which are high even with the high amount of additional (and precise) 
information available to the oracle. Indeed, this confirms that estimation in traffic systems is a 
rather difficult task, and that errors of absolute magnitude around 10-20 veh/km can be acceptable, 
as they capture the qualitative trend features of the traffic system - such as low density, medium 
density, high density. On the other side, estimate of flows are rather low, being less than 1 ~ 2 
vehicles for the 75% of the pairs (cell, time), and less than 2 ~ 3 for the 90% of the pairs. 

The biggest difference between estimated and measured densities can be observed at Eybens 
exit (cells 37-38), where the estimated flow constantly predicts a higher density than the measured 
one. The explanation is however very straightforward: as mentioned, we do not use ramp data 
in order to show the prowess of our method even employing a very small number of sensors. 
On the other hand, as mentioned in the description of the data, the exit of Eybens is a critical 
point whose ramp is selected by a high fraction of vehicles to exit the Rocade towards the town. 
However, the corresponding cells belong to a long FCD segment running from Saint-Martin- 
d’Heres (cells 32-33) to Eybens entrance (cells 41-42), which provides, during peak time, just 
one set of rather low speed measurements, which do not distinguish between the stretch of road 
before Eybens exit (congested and at low speed) against that after Eybens exit (uncongested and 
at high speed). Due to the so obtained low speed measurement, the algorithm tends to estimate 
a high number of vehicles along the whole segment, instead of two different regimes before and 
after the ramp. Analogously, we observe a mismatch between measured and estimated flow in the 
first and last sections, which are due to unobserved flows from onramp and to offramps. A second 
discrepancy is the smoothness of the reconstructed density and flow, as compared with the more 
scattered measurements. The latter is due to the high measurement rate, which during stop-and- 
go phenomena results in measurements which rapidly oscillate between stopped vehicles and low 
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or medium speed. On the converse, the optimization based flow reconstruction and the first order 
mass conservation law for densities have a low pass effect therefore producing smoother, more 
regular patterns. Further research direction will investigate the possibility to detect stop-and-go 
phenomena and reproduce, at least qualitatively, the resulting irregular patterns. 

VI. Conclusions 

This paper addressed the problems of data fusion of heterogeneous sources of information for 
density estimation in Road Transportation Networks and optimal sensor placement via a heuristic 
that we called Virtual Variance algorithm. A gradient descent procedure for the calibration of 
the Fundamental Diagram is also discussed. Efficacy of the proposed solutions is illustrated on 
a regular grid and on the real world scenario of the Rocade Sud in Grenoble. Future research 
directions include and are not limited to estimation of statistical properties of measurement 
noises from real data and development of stochastic models for the relation between flows, 
speed and densities, optimization of the observer’s parameters for minimization of mean-square 
reconstruction error, calibration of the matrix of splitting ratios, and extension of the optimal 
sensor placement strategy to maximize density reconstruction performance. 
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